hpgltools was written to make working with high-throughput data analyses easier. These analyses generally fall into a few stages:
Before any of these tasks may be performed, the data must be loaded into memory. hpgltools attempts to make this easier with create_expt() and subset_expt().
The following examples will use a real data set from a recent experiment in our lab. The raw data was processed using a mix of trimmomatic, biopieces, bowtie, samtools, and htseq. The final count tables were deposited into the ‘preprocessing/count_tables/’ tree. The resulting data structure was named ‘most_v0M1,’ named because it is comprised of count tables with 0 mismatches and 1 randomly-placed multi-match.
The annotation file was mgas_5005.gff.xz resides in ‘reference/gff/’.
The count tables and meta-data were loaded through the create_expt() function and the genome annotations were loaded with gff2df().
library(hpgltools)
data_file <- system.file("cdm_expt.rda", package="hpgltools")
cdm <- new.env()
load(data_file, envir=cdm)
rm(data_file)
ls()## [1] "cdm" "rmd_file"
## 2 variables should exist now: rmd_file in case I want to knitr this file, cdm which is a list including the data required to make an expressionset.
## expt <- create_expt(metadata='filename.xlsx', gene_info=annotation_data)
expt <- create_expt(count_dataframe=cdm$cdm_counts, metadata=cdm$cdm_metadata, gene_info=cdm$gene_info)## Bringing together the count matrix and gene information.
## Loading required namespace: Biobase
## The gff information is in 'annotations'
## The experiment is in most_v0M1
## Here is the meta-data! (well, the first 6 lines anyway).
knitr::kable(head(expt$design))| sampleid | type | stage | replicate | mutantname | media | exptdate | libdate | batch | condition | readspassed | ncrna | xncrna | remaining | genome | xgenome | otherbacterial | xother | colors | counts | intercounts | design | file | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HPGL0418 | HPGL0418 | WT | LL | 1 | 5448-1 | CDMg | 20130430 | 20140409 | a | wt_ll_cg | 17425675 | 350440 | 2.01% | 17075235 | 16061860 | 94.07% | 356542 | 2.09% | #E12C8C | processed_data/count_tables/HPGL0418/HPGL0418_forward-trimmed-v1M1l20.count.xz | unknown | a | null |
| HPGL0419 | HPGL0419 | WT | LL | 2 | 5448-2 | CDMg | 20130430 | 20140409 | b | wt_ll_cg | 18106361 | 398984 | 2.20% | 17707377 | 16619183 | 93.85% | 392803 | 2.22% | #E12C8C | processed_data/count_tables/HPGL0419/HPGL0419_forward-trimmed-v1M1l20.count.xz | unknown | b | null |
| HPGL0420 | HPGL0420 | mga | LL | 1 | Mga-1 | CDMg | 20130430 | 20140409 | a | mga1_ll_cg | 20499936 | 417440 | 2.04% | 20082496 | 18062942 | 89.94% | 416260 | 2.07% | #BE5067 | processed_data/count_tables/HPGL0420/HPGL0420_forward-trimmed-v1M1l20.count.xz | unknown | a | null |
| HPGL0421 | HPGL0421 | mga | LL | 2 | Mga-2 | CDMg | 20130430 | 20140409 | b | mga1_ll_cg | 17216381 | 397021 | 2.31% | 16819360 | 14795703 | 87.97% | 376295 | 2.24% | #BE5067 | processed_data/count_tables/HPGL0421/HPGL0421_forward-trimmed-v1M1l20.count.xz | unknown | b | null |
| HPGL0422 | HPGL0422 | WT | LL | 1 | 5448-1 | CDMf | 20130430 | 20140415 | a | wt_ll_cf | 17112706 | 367791 | 2.15% | 16744915 | 15114930 | 90.27% | 367974 | 2.20% | #8E7E40 | processed_data/count_tables/HPGL0422/HPGL0422_forward-trimmed-v1M1l20.count.xz | unknown | a | null |
| HPGL0423 | HPGL0423 | WT | LL | 2 | 5448-2 | CDMf | 20130430 | 20140415 | b | wt_ll_cf | 18782729 | 395748 | 2.11% | 18386981 | 17330239 | 94.25% | 386443 | 2.10% | #8E7E40 | processed_data/count_tables/HPGL0423/HPGL0423_forward-trimmed-v1M1l20.count.xz | unknown | b | null |
summary(expt)## Length Class Mode
## title 1 -none- character
## notes 1 -none- character
## initial_metadata 23 data.frame list
## expressionset 1 ExpressionSet S4
## design 23 data.frame list
## conditions 8 -none- character
## batches 8 -none- character
## samplenames 8 -none- character
## colors 8 -none- character
## state 5 -none- list
## original_expressionset 1 ExpressionSet S4
## original_libsize 8 -none- numeric
## libsize 8 -none- numeric
## original_metadata 1 AnnotatedDataFrame S4
The data structure generated by create_expt() is a list containing the following slots:
One possibility would be to examine the data in its unmolested state:
raw_metrics <- sm(graph_metrics(expt, qq=TRUE))The function graph_metrics() performs all of the likely plots one might want. Some of which are not really appropriate for non-normalized data unless it is incredibly well behaved (after 30 years, I still want to spell behaved ‘behaived’, why is that?).
## View a raw library size plot
raw_metrics$libsize## Or boxplot to see the data distribution
raw_metrics$boxplot## Warning: Transformation introduced infinite values in continuous y-axis
## The warning is because it automatically uses a log scale and there are some 0 count genes.
## Perhaps you prefer density plots
raw_metrics$density## Warning: Transformation introduced infinite values in continuous x-axis
## quantile/quantile plots compared to the median of all samples
raw_metrics$qqrat## Here we can see some samples are differently 'shaped' compared to the median than others
## There are other plots one may view, but this data set is a bit too crowded as is.
## The following summary shows the other available plots:
summary(raw_metrics)## Length Class Mode
## nonzero 9 gg list
## libsize 9 gg list
## boxplot 9 gg list
## corheat 3 recordedplot list
## smc 9 gg list
## disheat 3 recordedplot list
## smd 9 gg list
## pcaplot 9 gg list
## pcatable 8 data.frame list
## pcares 4 data.frame list
## pcavar 7 -none- numeric
## density 9 gg list
## legend 2 -none- list
## qqlog 3 recordedplot list
## qqrat 3 recordedplot list
## ma 0 -none- NULL
The plots are all generated by calling plot_something() where the somethings are:
On the other hand, we might take a subset of the data to focus on the late-log vs. early-log samples.
The expt_subset() function allows one to pull material from the experimental design.
Once we have a smaller data set, we can more easily use PCA to see how the sample separate.
head(expt$design)## sampleid type stage replicate mutantname media exptdate libdate
## HPGL0418 HPGL0418 WT LL 1 5448-1 CDMg 20130430 20140409
## HPGL0419 HPGL0419 WT LL 2 5448-2 CDMg 20130430 20140409
## HPGL0420 HPGL0420 mga LL 1 Mga-1 CDMg 20130430 20140409
## HPGL0421 HPGL0421 mga LL 2 Mga-2 CDMg 20130430 20140409
## HPGL0422 HPGL0422 WT LL 1 5448-1 CDMf 20130430 20140415
## HPGL0423 HPGL0423 WT LL 2 5448-2 CDMf 20130430 20140415
## batch condition readspassed ncrna xncrna remaining genome
## HPGL0418 a wt_ll_cg 17425675 350440 2.01% 17075235 16061860
## HPGL0419 b wt_ll_cg 18106361 398984 2.20% 17707377 16619183
## HPGL0420 a mga1_ll_cg 20499936 417440 2.04% 20082496 18062942
## HPGL0421 b mga1_ll_cg 17216381 397021 2.31% 16819360 14795703
## HPGL0422 a wt_ll_cf 17112706 367791 2.15% 16744915 15114930
## HPGL0423 b wt_ll_cf 18782729 395748 2.11% 18386981 17330239
## xgenome otherbacterial xother colors
## HPGL0418 94.07% 356542 2.09% #E12C8C
## HPGL0419 93.85% 392803 2.22% #E12C8C
## HPGL0420 89.94% 416260 2.07% #BE5067
## HPGL0421 87.97% 376295 2.24% #BE5067
## HPGL0422 90.27% 367974 2.20% #8E7E40
## HPGL0423 94.25% 386443 2.10% #8E7E40
## counts
## HPGL0418 processed_data/count_tables/HPGL0418/HPGL0418_forward-trimmed-v1M1l20.count.xz
## HPGL0419 processed_data/count_tables/HPGL0419/HPGL0419_forward-trimmed-v1M1l20.count.xz
## HPGL0420 processed_data/count_tables/HPGL0420/HPGL0420_forward-trimmed-v1M1l20.count.xz
## HPGL0421 processed_data/count_tables/HPGL0421/HPGL0421_forward-trimmed-v1M1l20.count.xz
## HPGL0422 processed_data/count_tables/HPGL0422/HPGL0422_forward-trimmed-v1M1l20.count.xz
## HPGL0423 processed_data/count_tables/HPGL0423/HPGL0423_forward-trimmed-v1M1l20.count.xz
## intercounts design file
## HPGL0418 unknown a null
## HPGL0419 unknown b null
## HPGL0420 unknown a null
## HPGL0421 unknown b null
## HPGL0422 unknown a null
## HPGL0423 unknown b null
## elt stands for: "early/late in thy"
batch_a <- expt_subset(expt, subset="batch=='a'")
batch_b <- expt_subset(expt, subset="batch=='b'")
a_metrics <- graph_metrics(batch_a)## Graphing number of non-zero genes with respect to CPM by library.
## Graphing library sizes.
## Warning: Ignoring unknown aesthetics: parse
## Graphing a boxplot.
## I am reasonably sure this should be log scaled and am setting it.
## If this is incorrect, set scale='raw'
## Graphing a correlation heatmap.
## Graphing a standard median correlation.
## Performing correlation.
## Graphing a distance heatmap.
## Graphing a standard median distance.
## Performing distance.
## Graphing a PCA plot.
## [1] "There is just one batch in this data."
## Plotting a density plot.
## This data will benefit from being displayed on the log scale.
## If this is not desired, set scale='raw'
## Printing a color to condition legend.
a_metrics$pcaplotb_metrics <- graph_metrics(batch_b)## Graphing number of non-zero genes with respect to CPM by library.
## Graphing library sizes.
## Warning: Ignoring unknown aesthetics: parse
## Graphing a boxplot.
## I am reasonably sure this should be log scaled and am setting it.
## If this is incorrect, set scale='raw'
## Graphing a correlation heatmap.
## Graphing a standard median correlation.
## Performing correlation.
## Graphing a distance heatmap.
## Graphing a standard median distance.
## Performing distance.
## Graphing a PCA plot.
## [1] "There is just one batch in this data."
## Plotting a density plot.
## This data will benefit from being displayed on the log scale.
## If this is not desired, set scale='raw'
## Printing a color to condition legend.
b_metrics$pcaplotIt is pretty obvious that the raw data is a bit jumbled according to PCA. This is not paricularly suprising since we didn’t normalize it at all. Therefore, in this block I will normalize it a few ways and follow up with some visualizations of showing how the apparent relationships change in the data.
## doing nothing to the data except log2 transforming it has a surprisingly large effect
norm_test <- normalize_expt(expt, transform="log2")## This function will replace the expt$expressionset slot with:
## log2(data)
## It backs up the current data into a slot named:
## expt$backup_expressionset. It will also save copies of each step along the way
## in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
## when invoking limma. The appropriate libsize is the non-log(cpm(normalized)).
## This is most likely kept at:
## 'new_expt$normalized$intermediate_counts$normalization$libsizes'
## A copy of this may also be found at:
## new_expt$best_libsize
## Filter is false, this should likely be set to something, good
## choices include cbcb, kofa, pofa (anything but FALSE). If you want this to
## stay FALSE, keep in mind that if other normalizations are performed, then the
## resulting libsizes are likely to be strange (potentially negative!)
## Leaving the data unconverted. It is often advisable to cpm/rpkm
## the data to normalize for sampling differences, keep in mind though that rpkm
## has some annoying biases, and voom() by default does a cpm (though hpgl_voom()
## will try to detect this).
## Leaving the data unnormalized. This is necessary for DESeq, but
## EdgeR/limma might benefit from normalization. Good choices include quantile,
## size-factor, tmm, etc.
## Not correcting the count-data for batch effects. If batch is
## included in EdgerR/limma's model, then this is probably wise; but in extreme
## batch effects this is a good parameter to play with.
## Step 1: not doing count filtering.
## Step 2: not normalizing the data.
## Step 3: not converting the data.
## Step 4: transforming the data with log2.
## transform_counts: Found 923 values equal to 0, adding 1 to the matrix.
## Step 5: not doing batch correction.
l2_metrics <- sm(graph_metrics(norm_test))## a quantile normalization alone affect some, but not all of the data
norm_test <- sm(normalize_expt(expt, norm="quant"))
q_metrics <- sm(graph_metrics(norm_test)) ## q for quant, oh oh oh!## cpm alone brings out some samples, too
norm_test <- sm(normalize_expt(expt, convert="cpm"))
c_metrics <- sm(graph_metrics(norm_test)) ## c for cpm!## low count filtering has some effect, too
norm_test <- sm(normalize_expt(expt, filter="pofa"))
f_metrics <- sm(graph_metrics(norm_test)) ## f for filter!## how about if we mix and match methods?
norm_test <- sm(normalize_expt(expt, transform="log2", convert="cpm", norm="quant", batch="combat_scale", filter=TRUE, batch_step=4, low_to_zero=TRUE))## Some metrics are not very useful on (especially quantile) normalized data
norm_graphs <- sm(graph_metrics(norm_test))Now lets see some of the resulting metrics, in this case I will just compare some pca plots, as they are good at fooling our silly visual brains into seeing patterns.
l2_metrics$pcaplot## Also viewable with plot_pca()$plot
## PCA plots seem (to me) to prefer log2 scale data.
q_metrics$pcaplot## only normalizing on the quantiles leaves the data open to scale effects.
c_metrics$pcaplot## but cpm alone is insufficient
f_metrics$pcaplot## only filtering out low-count genes is helpful as well
norm_graphs$pcaplot## The different batch effect testing methods have a pretty widely ranging effect on the clustering
## play with them by changing the batch= parameter to:
## "limma", "sva", "svaseq", "limmaresid", "ruvg", "combat", combatmod"
knitr::kable(norm_graphs$pcares)| propVar | cumPropVar | cond.R2 | batch.R2 |
|---|---|---|---|
| 82.42 | 82.42 | 99.36 | 0.00 |
| 6.39 | 88.81 | 86.57 | 1.80 |
| 4.02 | 92.83 | 49.64 | 1.55 |
| 2.80 | 95.63 | 8.87 | 17.53 |
| 2.11 | 97.74 | 48.94 | 6.99 |
| 1.27 | 99.01 | 6.16 | 68.83 |
| 0.99 | 100.00 | 0.45 | 3.31 |
## Thus we see a dramatic decrease in variance accounted for
## by batch after applying limma's 'removebatcheffect'
## (see batch.R2 here vs. above)
norm_graphs$smcnorm_graphs$disheat ## svaseq's batch correction seems to draw out the signal quite nicely.## It is worth noting that the wt, early log, thy, replicate c samples are still a bit weird.This is a relatively small data set, so performing some differential expression analyses really should not take long at all.
When performing these analyses with hpgltools, it will attempt to perform similar analyses with limma, edgeR, and DESeq2 via the all_pairwise() function. The most likely argument is ‘model_batch’ which may be used to explicitly include/exclude a batch factor in the model, or ask it to attempt including batch factors from sva/ruv/etc. By default it will attempt to include a column from the experimental design named ‘batch’.
spyogenes_de <- sm(all_pairwise(expt))## Even the lowest correlations are quite high.The result of all_pairwise() is a list of the results from limma, edger, and deseq. In addition, I implemented a very simplistic, differential expression function named ‘basic()’. It also provides some simple measurements of how well the various analyses agree (ergo the black and white heatmap).
Working with these separate tables can be more than a little annoying, combine_de_tables() attempts to simplify this. It will bring together the various tables, and if asked attempt to bring them together into a pretty-ified excel workbook.
all_pairwise() arbitrarily performs all possible pairwise comparisons. This is not necessarily what one actually wishes to see. Therefore, the argument keepers takes a list of contrasts:
my_keepers <- list(
"wt_media" = c("wt_ll_cf", "wt_ll_cg"),
"mga_media" = c("mga_ll_cf", "mga_ll_cg"))In the above example, if the keepers argument to combine_de_tables() is given as my_keepers, then the resulting table will not have the set of 6 possible comparisons, but instead will only have 2 tables named ‘wt_media’ and ‘mga_media’, which if printed to excel will be sheets named accordingly.
Because these tables can get quickly extremely large, the excel workbooks may become too big for the zip(1) program to properly merge. If that happens, one may either remove some(or all) of the plots and/or have it print csv versions of the same tables.
spyogenes_tables <- sm(combine_de_tables(spyogenes_de, excel=FALSE))
summary(spyogenes_tables)## Length Class Mode
## data 6 -none- list
## limma_plots 6 -none- list
## edger_plots 6 -none- list
## deseq_plots 6 -none- list
## comp_plot 0 -none- NULL
## venns 0 -none- list
## de_summary 22 data.frame list
Finally, extract_significant_genes() may choose ‘significant’ genes based upon a few metrics including z-score vs. the distribution of logFC; a logFC cutoff, (ajusted)p-value cutoff, and/or top/bottom n genes.
spyogenes_sig <- sm(extract_significant_genes(spyogenes_tables, excel=FALSE))
knitr::kable(head(spyogenes_sig$limma$ups[[1]]))| seqnames | start | end | width | strand | source | type | id | dbxref | gbkey | name | note | gene | locustag | parent | genesynonym | limma_logfc | deseq_logfc | edger_logfc | limma_adjp | deseq_adjp | edger_adjp | limma_ave | limma_t | limma_p | limma_b | limma_q | deseq_basemean | deseq_lfcse | deseq_stat | deseq_p | deseq_q | edger_logcpm | edger_lr | edger_p | edger_q | basic_nummed | basic_denmed | basic_numvar | basic_denvar | basic_logfc | basic_t | basic_p | basic_adjp | fc_meta | fc_var | fc_varbymed | p_meta | p_var | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M5005_Spy1734 | NC_007297 | 1696642 | 1696947 | 306 | - | RefSeq | gene | M5005_Spy1734 | GeneID:3571135 | Gene | M5005_Spy_1734 | M5005_Spy1734 | NA | M5005_Spy_1734 | character(0) | character(0) | 4.065 | 2.521 | 4.301 | 0.0418 | 0 | 0 | 0.1655 | 7.563 | 2e-04 | 0.9706 | 0.0301 | 20.69 | 0.4917 | 5.127 | 0 | 1e-04 | 1.011 | 44.13 | 0 | 8.441e-09 | 2.430 | 0.3091 | 3.754e-04 | 3.459e-02 | 2.121 | -16.040 | 3.749e-02 | 6.256e-01 | 3.618 | 4.045e-02 | 1.118e-02 | 5.076e-05 | 7.686e-09 |
| M5005_Spy1735 | NC_007297 | 1696949 | 1698145 | 1197 | - | RefSeq | gene | M5005_Spy1735 | GeneID:3571136 | Gene | speB | M5005_Spy1735 | speB | M5005_Spy_1735 | character(0) | character(0) | 3.684 | 3.145 | 3.696 | 0.0048 | 0 | 0 | 2.2940 | 12.200 | 0e+00 | 3.7300 | 0.0035 | 83.91 | 0.3545 | 8.872 | 0 | 0e+00 | 2.925 | 101.70 | 0 | 6.258e-21 | 4.349 | 1.0820 | 3.503e-02 | 1.818e-02 | 3.266 | -20.020 | 3.833e-03 | 6.256e-01 | 3.355 | 3.406e-01 | 1.015e-01 | 2.150e-06 | 1.387e-11 |
| M5005_Spy1574 | NC_007297 | 1535303 | 1535449 | 147 | - | RefSeq | gene | M5005_Spy1574 | GeneID:3571302 | Gene | M5005_Spy_1574 | M5005_Spy1574 | NA | M5005_Spy_1574 | character(0) | character(0) | 2.484 | 2.264 | 2.472 | 0.0232 | 0 | 0 | 4.5130 | 8.772 | 1e-04 | 2.4760 | 0.0167 | 374.10 | 0.2755 | 8.219 | 0 | 0e+00 | 5.043 | 65.65 | 0 | 2.595e-13 | 5.961 | 3.5370 | 5.230e-03 | 2.524e-02 | 2.424 | -19.640 | 1.122e-02 | 6.256e-01 | 2.513 | 5.196e-01 | 2.068e-01 | 2.012e-05 | 1.215e-09 |
| M5005_Spy1714 | NC_007297 | 1674093 | 1675160 | 1068 | - | RefSeq | gene | M5005_Spy1714 | GeneID:3571154 | Gene | M5005_Spy_1714 | M5005_Spy1714 | NA | M5005_Spy_1714 | character(0) | character(0) | 2.457 | 1.841 | 2.496 | 0.0471 | 0 | 0 | 3.4400 | 6.830 | 3e-04 | 0.8566 | 0.0339 | 359.00 | 0.3323 | 5.540 | 0 | 0e+00 | 4.969 | 36.30 | 0 | 3.259e-07 | 2.325 | 0.7846 | 1.328e-01 | 4.068e-03 | 1.540 | -5.888 | 9.723e-02 | 6.256e-01 | 2.400 | 3.915e-01 | 1.631e-01 | 8.971e-05 | 2.414e-08 |
| M5005_Spy1575 | NC_007297 | 1535635 | 1536831 | 1197 | - | RefSeq | gene | M5005_Spy1575 | GeneID:3571303 | Gene | norA | M5005_Spy1575 | norA | M5005_Spy_1575 | character(0) | character(0) | 2.435 | 2.293 | 2.422 | 0.0048 | 0 | 0 | 6.8270 | 12.140 | 0e+00 | 4.5180 | 0.0035 | 1849.00 | 0.2167 | 10.580 | 0 | 0e+00 | 7.338 | 87.72 | 0 | 4.838e-18 | 8.250 | 5.8810 | 3.683e-02 | 5.610e-02 | 2.368 | -10.990 | 9.467e-03 | 6.256e-01 | 2.430 | 4.826e-01 | 1.986e-01 | 2.494e-06 | 1.867e-11 |
| M5005_Spy1139 | NC_007297 | 1116810 | 1117514 | 705 | - | RefSeq | gene | M5005_Spy1139 | GeneID:3571763 | Gene | nagB | M5005_Spy1139 | nagB | M5005_Spy_1139 | character(0) | character(0) | 2.159 | 2.003 | 2.157 | 0.0232 | 0 | 0 | 4.5960 | 8.961 | 1e-04 | 2.6080 | 0.0167 | 337.70 | 0.2525 | 7.932 | 0 | 0e+00 | 4.895 | 58.11 | 0 | 9.572e-12 | 5.919 | 3.8630 | 6.717e-02 | 2.164e-02 | 2.055 | -9.753 | 2.110e-02 | 6.256e-01 | 2.112 | 5.295e-01 | 2.507e-01 | 1.759e-05 | 9.282e-10 |
Since most of my circos graphs are for pyogenes, it is likely that the defaults are appropriate for this particular organism.
Much(all) of the following is taken from the material in tests/testthat/test_70mga.R
microbe_ids <- as.character(sm(get_microbesonline_ids("pyogenes MGAS5005")))
mgas_df <- sm(get_microbesonline_annotation(microbe_ids[[1]])[[1]])
mgas_df$sysName <- gsub(pattern="Spy_", replacement="Spy", x=mgas_df$sysName)
rownames(mgas_df) <- make.names(mgas_df$sysName, unique=TRUE)
## First make a template configuration
circos_test <- circos_prefix()## This assumes you have a colors.conf in circos/colors/ and fonts.conf in circos/fonts/
## It also assumes you have conf/ideogram.conf, conf/ticks.conf, and conf/housekeeping.conf
## It will write circos/conf/mgas.conf with a reasonable first approximation config file.
## Wrote karyotype to 5
## This should match the karyotype= line in mgas.conf
## Warning in file.symlink(from, to): cannot symlink 'conf/mgas.conf' to './
## mgas.conf', reason 'File exists'
## Fill it in with the data for s.pyogenes
circos_kary <- circos_karyotype("mgas", length=1895017)## Wrote karyotype to circos/conf/karyotypes/mgas.conf
## This should match the karyotype= line in mgas.conf
## Fill in the gene category annotations by gene-strand
circos_plus <- circos_plus_minus(mgas_df, circos_test)## This function assumes an input table including the columns: 'start', 'stop', 'strand', and 'COGFun'
## Writing data file: circos/data/mgas_plus_go.txt with the + strand GO data.
## Writing data file: circos/data/mgas_minus_go.txt with the - strand GO data.
## Wrote the +/- config files. Appending their inclusion to the master file.
## Returning the inner width: 0.84. Use it as the outer for the next ring.
circos_limma_hist <- circos_hist(spyogenes_de$limma$all_tables[[1]], mgas_df, circos_test, outer=circos_plus)## Writing data file: circos/data/mgas_logFC_hist.txt with the logFC column.
circos_deseq_hist <- circos_hist(spyogenes_de$deseq$all_tables[[1]], mgas_df, circos_test, outer=circos_limma_hist)## Writing data file: circos/data/mgas_logFC_hist.txt with the logFC column.
circos_edger_hist <- circos_hist(spyogenes_de$edger$all_tables[[1]], mgas_df, circos_test, outer=circos_deseq_hist)## Writing data file: circos/data/mgas_logFC_hist.txt with the logFC column.
circos_suffix(cfgout=circos_test)
circos_made <- circos_make(target="mgas")
## For some reason this fails weirdly when not run interactively.circos result
GenoplotR is new to me, but it seems to work?
genoplot_chromosome()## Warning in strings[i] <- downloaded: number of items to replace is not a
## multiple of replacement length
## Warning: closing unused connection 5 (circos/Makefile)
pander::pander(sessionInfo())R version 3.3.2 (2016-10-31)
**Platform:** x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.utf8, LC_NUMERIC=C, LC_TIME=en_US.utf8, LC_COLLATE=en_US.utf8, LC_MONETARY=en_US.utf8, LC_MESSAGES=en_US.utf8, LC_PAPER=en_US.utf8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.utf8 and LC_IDENTIFICATION=C
attached base packages: methods, stats, graphics, grDevices, utils, datasets and base
other attached packages: hpgltools(v.2016.02)
loaded via a namespace (and not attached): Biobase(v.2.32.0), RMySQL(v.0.10.9), edgeR(v.3.14.0), splines(v.3.3.2), foreach(v.1.4.3), gtools(v.3.5.0), Formula(v.1.2-1), assertthat(v.0.1), highr(v.0.6), stats4(v.3.3.2), latticeExtra(v.0.6-28), pander(v.0.6.0), robustbase(v.0.92-6), yaml(v.2.1.14), ggrepel(v.0.6.3), RSQLite(v.1.0.0), backports(v.1.0.4), lattice(v.0.20-34), limma(v.3.28.21), chron(v.2.3-47), digest(v.0.6.10), GenomicRanges(v.1.24.3), RColorBrewer(v.1.1-2), XVector(v.0.12.1), colorspace(v.1.3-1), htmltools(v.0.3.5), preprocessCore(v.1.34.0), Matrix(v.1.2-7.1), plyr(v.1.8.4), DESeq2(v.1.12.4), XML(v.3.98-1.5), devtools(v.1.12.0), genefilter(v.1.54.2), zlibbioc(v.1.18.0), xtable(v.1.8-2), corpcor(v.1.6.8), scales(v.0.4.1), genoPlotR(v.0.8.4), openxlsx(v.3.0.0), BiocParallel(v.1.6.6), htmlTable(v.1.7), tibble(v.1.2), annotate(v.1.50.1), mgcv(v.1.8-16), IRanges(v.2.6.1), ggplot2(v.2.2.0), withr(v.1.0.2), SummarizedExperiment(v.1.2.3), nnet(v.7.3-12), BiocGenerics(v.0.18.0), lazyeval(v.0.2.0), survival(v.2.40-1), magrittr(v.1.5), crayon(v.1.3.2), memoise(v.1.0.0), evaluate(v.0.10), doParallel(v.1.0.10), nlme(v.3.1-128), foreign(v.0.8-67), tools(v.3.3.2), data.table(v.1.9.6), matrixStats(v.0.51.0), stringr(v.1.1.0), S4Vectors(v.0.10.3), locfit(v.1.5-9.1), munsell(v.0.4.3), cluster(v.2.0.5), AnnotationDbi(v.1.34.4), ade4(v.1.7-4), compiler(v.3.3.2), GenomeInfoDb(v.1.8.7), grid(v.3.3.2), RCurl(v.1.95-4.8), iterators(v.1.0.8), bitops(v.1.0-6), labeling(v.0.3), rmarkdown(v.1.2), testthat(v.1.0.2), gtable(v.0.2.0), codetools(v.0.2-15), DBI(v.0.5-1), reshape2(v.1.4.2), R6(v.2.2.0), gridExtra(v.2.2.1), knitr(v.1.15.1), Hmisc(v.4.0-0), rprojroot(v.1.1), KernSmooth(v.2.23-15), stringi(v.1.1.2), parallel(v.3.3.2), sva(v.3.20.0), Rcpp(v.0.12.8), geneplotter(v.1.50.0), rpart(v.4.1-10), acepack(v.1.4.1) and DEoptimR(v.1.0-8)